New Marginal Spectrum Feature Information Views of Humpback Whale Vocalization Signals Using the EMD Analysis Methods

Marginal spectrum (MS) feature information of humpback whale vocalization (HWV) signals is an interesting and significant research topic. Empirical mode decomposition (EMD) is a powerful time–frequency analysis tool for marine mammal vocalizations. In this paper, new MS feature information of HWV signals was extracted using the EMD analysis method. Thirty-six HWV samples, each with a duration of 17.2 ms, were classified into Classes I, II, and III, which consisted of 15, 5, and 16 samples, respectively. The following ratios were evaluated: the average energy ratios of the first intrinsic mode function (IMF1) and residual function (RF) to the referred total energy for the Class I samples; the average energy ratios of the IMF1, 2nd IMF (IMF2), and RF to the referred total energy for the Class II samples; and the average energy ratios of the IMF1, 6th IMF (IMF6), and RF to the referred total energy for the Class III samples. These average energy ratios were all more than 10%. The average energy ratios of IMF1 to the referred total energy were 9.825%, 13.790%, 4.938%, 3.977%, and 3.32% in the 2980–3725, 3725–4470, 4470–5215, 10,430–11,175, and 11,175–11,920 Hz bands, respectively, in the Class I samples; 14.675% and 4.910% in the 745–1490 and 1490–2235 Hz bands, respectively, in the Class II samples; and 12.0640%, 6.8850%, and 4.1040% in the 2980–3725, 3725–4470, and 11,175–11,920 Hz bands, respectively, in the Class III samples. The results of this study provide a better understanding of, and new high-resolution views on, the information obtained from the MS features of the HWV signals.


The Related References
Hilbert-Huang transform (HHT) time-frequency (TF) analysis is an active research topic. It is an empirical method that can be used to analyze non-stationary and non-linear signals [1,2]. Moreover, it is an adaptive signal-analysis scheme, which means that the basis is defined by the signal itself. Further, the HHT-TF analysis scheme has revealed physical concepts in several instances of signal analysis. The aim of signal TF analysis is to extract information from the signal that demonstrates the underlying mechanisms, structures, and actual behaviors of various physical phenomena [3]. The non-stationary and non-linear characteristics of short-duration signals can be extracted using the HHT technique. The original motivation for HHT technique development was to [...] (DSCV) for KWs in the North Pacific. High-frequency modulated signals were recorded with a frequency range of 15.7-21.6 kHz, a 10 dB bandwidth of 5.9 kHz, and an analysis time of 65.2 ms. An earlier report used summed auto-correlation and Fourier transform frequency analysis methods to measure the pulse rate and peak frequency for Southeast Pacific blue whale (BW) song types [19]. The peak frequency of these vocalizations was approximately 32 Hz.
Frazer et al. [20] found the frequency range of humpback whale vocalization (HWV) to be from 20 Hz to 8 kHz, whereas Au et al. [21] reported that the high-frequency harmonics of HWV songs extended beyond 24 kHz. Mature male humpback whales produce elaborate acoustics in low-frequency bands of 0-1.5 kHz. Daily root-mean-squared sound pressure levels can be calculated to compare variations in low-frequency acoustic energy and to monitor humpback whale populations [22]. Male humpback whales produce long, structured sequences of acoustic vocalization, and the frequency distributions of the mean pulse-repetition rate can reach 3.97 kHz. Male HWV songs have a minimum frequency below 400 Hz and a maximum frequency above 3 kHz but below 8 kHz [23]. The songs are loud and of long duration; they are produced in the frequency range of 8 Hz-8 kHz, last from several minutes to hours, and have been noted to be frequency and amplitude modulated [24]. Angela et al. [25] examined non-song HWVs, whose frequencies vary from 9 Hz to 6 kHz. The frequencies of the majority of these vocalizations were under 200 Hz, and the duration of non-song vocalizations was found to be between 0.09 and 3.59 s. Table 1 lists the comparison of features extracted from BFWV, BMSWV DP, BBW ES, KW FMTUC, BOWV, KW DSCV, BW song, HWVs, HWV, HWV songs, and HWV non-song.
The Fourier method is often used to analyze HWVs; the energy in the frequency domain then serves as a representation of the original HWVs. However, the Fourier method is not suited to non-linear and non-stationary signals. The EMD-based analysis method [9] can instead distribute the energy over the time-frequency space as a representation of the HWVs. Several IMFs and one RF signal are generated in parallel and analyzed using the EMD method. Because the method can be applied to non-linear and non-stationary signals, it is suitable for extracting features from HWVs.
The aims of this study are to provide a better understanding of, and new high-resolution perspectives on, the MS feature information contained in HWV signals.
Zhang et al. [26] developed a 3D spatial and spectral-aware convolution module in which the spatial and spectral features of the target spectrum were extracted using 3D convolution. The spatial and texture features were extracted using a 2D convolution module with channel and spatial attention. A hyperspectral image dataset containing 1200 samples taken from ten corn varieties was constructed. The nondestructive identification of corn seeds was demonstrated using hyperspectral images.
The remainder of this paper is organized as follows. Section 1.2 describes the related EMD-based MS analysis method. Section 2 presents the humpback whale vocalization (HWV) samples. Section 3 presents the analysis results. The discussion and concluding remarks are presented in Sections 4 and 5, respectively.

The Related EMD-Based MS Analysis Method [9]
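Lin et al. [9] proposed the EMD-based MS analysis method, which is applied here to extract new MS feature information views for HWV samples. The HWV samples, denoted as $hwv(t)$, were adaptively decomposed into $N$ IMFs and one RF using empirical mode decomposition (EMD), as follows:

$$hwv(t) = \sum_{i=1}^{N} IMF_{hwvi}(t) + rf(t), \tag{1}$$

where $IMF_{hwvi}(t)$ and $rf(t)$ are the $i$-th IMF and the RF of the HWV samples, respectively. The referred total energy of $hwv(t)$, written here as $E_{hwv}$, is provided by

$$E_{hwv} = \sum_{i=1}^{N} \int IMF_{hwvi}^{2}(t)\,dt + \int rf^{2}(t)\,dt. \tag{2}$$

The energy ratio of the $i$-th IMF to the referred total energy of $hwv(t)$, $IMFRE_{hwvi}$, is defined [9] as

$$IMFRE_{hwvi} = \frac{\int IMF_{hwvi}^{2}(t)\,dt}{E_{hwv}}. \tag{3}$$

The energy ratio of the RF to the referred total energy of $hwv(t)$, $RFRE_{hwv}$, is defined [9] as

$$RFRE_{hwv} = \frac{\int rf^{2}(t)\,dt}{E_{hwv}}. \tag{4}$$

The analytic signal of the $i$-th IMF, $z_{hwvi}(t)$, is expressed as

$$z_{hwvi}(t) = IMF_{hwvi}(t) + j\,HT\{IMF_{hwvi}(t)\} = A_{hwvi}(t)\,e^{j\phi_{hwvi}(t)}. \tag{5}$$

In Equation (5), $A_{hwvi}(t)$ and $\phi_{hwvi}(t)$ are the amplitude and the phase of $z_{hwvi}(t)$, respectively, and are provided by

$$A_{hwvi}(t) = \sqrt{IMF_{hwvi}^{2}(t) + HT\{IMF_{hwvi}(t)\}^{2}}, \tag{6}$$

$$\phi_{hwvi}(t) = \arctan\!\left(\frac{HT\{IMF_{hwvi}(t)\}}{IMF_{hwvi}(t)}\right). \tag{7}$$

Here, $HT\{\,\}$ is the Hilbert transform.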
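The energy ratios of the IMFs and the RF to the referred total energy can be illustrated with a short numerical sketch. This is only an illustration under stated assumptions, not the authors' implementation: the components are synthetic stand-ins (in practice they would come from an EMD routine), the helper name `energy_ratios` is hypothetical, and the integrals are approximated by sums over samples, so the sampling interval cancels in each ratio.

```python
import numpy as np

def energy_ratios(imfs, rf):
    """Return (IMF energy ratios, RF energy ratio) relative to the
    referred total energy, i.e. the sum of all component energies."""
    imfs = np.asarray(imfs, dtype=float)    # shape (N, T): one row per IMF
    rf = np.asarray(rf, dtype=float)        # shape (T,): residual function
    imf_energy = np.sum(imfs ** 2, axis=1)  # discrete energy of each IMF
    rf_energy = np.sum(rf ** 2)             # discrete energy of the RF
    total = imf_energy.sum() + rf_energy    # referred total energy
    return imf_energy / total, rf_energy / total

# Synthetic stand-ins for two IMFs and one RF (assumed values, not whale data).
fs = 14_900                      # sampling frequency quoted in the paper, Hz
t = np.arange(256) / fs          # roughly one 17.2 ms analysis window
imfs = np.vstack([
    1.0 * np.sin(2 * np.pi * 4000 * t),
    0.5 * np.sin(2 * np.pi * 1000 * t),
])
rf = 0.8 * np.ones_like(t)

imf_re, rf_re = energy_ratios(imfs, rf)
print(np.round(100 * imf_re, 2), round(100 * rf_re, 2))  # ratios in percent
```

By construction the IMF ratios and the RF ratio sum to one, which is a convenient sanity check when applying the same computation to real decompositions.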
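The $i$-th IF of the HWV sample, $IF_{hwvi}(t)$, is provided by

$$IF_{hwvi}(t) = \frac{1}{2\pi}\,\frac{d\phi_{hwvi}(t)}{dt}, \tag{8}$$

where $\phi_{hwvi}(t)$ is the phase of the analytic signal of the $i$-th IMF. The marginal spectrum (MS) of $IMF_{hwvi}(t)$ in the $m$-$n$ kHz band [6], i.e., $MSRE_{hwvimn}$, is calculated as

$$MSRE_{hwvimn} = \frac{\int IMF_{hwvimn}^{2}(t)\,dt}{E_{hwv}}, \tag{9}$$

where $IMF_{hwvimn}^{2}(t)$ is the energy of $IMF_{hwvi}(t)$ in the $m$-$n$ kHz band and $E_{hwv}$ denotes the referred total energy of $hwv(t)$. The MS of $rf(t)$ in the $m$-$n$ kHz band, $MSRFRE_{hwvmn}$, is calculated as

$$MSRFRE_{hwvmn} = \frac{\int rf_{hwvmn}^{2}(t)\,dt}{E_{hwv}}, \tag{10}$$

where $rf_{hwvmn}^{2}(t)$ is the energy of $rf(t)$ in the $m$-$n$ kHz band.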
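The instantaneous frequency and the band-limited energy ratio can be sketched in the same spirit. This is a minimal sketch under assumptions, not the method of [9] itself: the analytic signal is built with a hand-written FFT-based Hilbert transform, the band energy is taken as the squared component values at the instants whose IF falls inside the band, and the helper names, the test tone, and the band edges are all hypothetical.

```python
import numpy as np

def analytic_signal(x):
    """FFT-based analytic signal z(t) = x(t) + j*HT{x(t)}."""
    n = len(x)
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0                      # keep the DC term once
    if n % 2 == 0:
        weights[n // 2] = 1.0             # keep the Nyquist term once
        weights[1:n // 2] = 2.0           # double the positive frequencies
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)

def instantaneous_frequency(x, fs):
    """IF in Hz: derivative of the unwrapped analytic phase over 2*pi."""
    phase = np.unwrap(np.angle(analytic_signal(x)))
    return np.gradient(phase) * fs / (2 * np.pi)

def band_energy_ratio(x, fs, f_lo, f_hi, total_energy):
    """Energy of x at instants whose IF lies in [f_lo, f_hi), over total_energy."""
    inst_f = instantaneous_frequency(x, fs)
    mask = (inst_f >= f_lo) & (inst_f < f_hi)
    return np.sum(x[mask] ** 2) / total_energy

# Hypothetical single-component test: a tone with exactly 10 cycles per window.
fs = 14_900
n = 256
f0 = 10 * fs / n                          # about 582 Hz
t = np.arange(n) / fs
tone = np.sin(2 * np.pi * f0 * t)
total = np.sum(tone ** 2)

inst_f = instantaneous_frequency(tone, fs)
ratio_in = band_energy_ratio(tone, fs, 500.0, 700.0, total)
ratio_out = band_energy_ratio(tone, fs, 745.0, 1490.0, total)
print(round(float(inst_f.mean()), 2), ratio_in, ratio_out)
```

For this idealized tone all of the energy falls in the band that contains its frequency. Real IMFs have time-varying IFs, so their energy is spread across several bands.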

Humpback Whale Vocalizations
An HWV recording (Recording No. 9220100Q) was downloaded from the Watkins Marine Mammal Sound Database [27] (https://cis.whoi.edu/science/B/whalesounds/index.cfm (accessed on 10 August 2023)). The HWV was recorded in the sea area around the British Virgin Islands (18° N, 64° W) at a water depth of 15 m. Figure 1 shows the full HWV, which is 5.7 s long; the sampling frequency of the vocalization was 14,900 Hz. Thirty-six HWV samples, each 17.2 ms in duration, were extracted from the full 9220100Q recording. These samples were categorized into Classes I, II, and III, which contained 15, 5, and 16 samples, respectively. The short 17.2 ms analysis window provides a high time-analysis resolution. We evaluated the average energy ratios of the IMF1 and RF to the referred total energy for the Class I samples, the average energy ratios of the IMF1, 2nd IMF (IMF2), and RF to the referred total energy for the Class II samples, and the average energy ratios of the IMF1, 6th IMF (IMF6), and RF to the referred total energy for the Class III samples. All of these energy ratios were larger than 10%. The classification criteria were as follows. The energy ratios of the IMF1 and RF to the referred total energy were larger than 40% and 20%, respectively, for every Class I sample. The energy ratios of the IMF1 + IMF2 and RF to the referred total energy were larger than 55% and 15%, respectively, for every Class II sample. The energy ratios of the IMF1 + IMF6 and RF to the referred total energy were larger than 40% and 55%, respectively, for every Class III sample. Figures 2-4 show the start and end times of the Class I, II, and III HWV samples, which provide a visual sense of these samples.
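To make the sample-extraction step concrete, the sketch below converts the 17.2 ms window into a sample count at the stated 14,900 Hz sampling frequency and cuts one segment from a recording. It is a sketch under assumptions: the recording is replaced by random noise of roughly the stated 5.7 s length, the start time is arbitrary, and `window_length` and `extract_segment` are hypothetical helper names.

```python
import numpy as np

FS_HZ = 14_900        # sampling frequency stated in the text
WINDOW_S = 17.2e-3    # analysis sample duration stated in the text

def window_length(fs_hz, window_s):
    """Number of samples in one analysis window (rounded to nearest)."""
    return int(round(fs_hz * window_s))

def extract_segment(recording, start_s, fs_hz=FS_HZ, window_s=WINDOW_S):
    """Cut one analysis window starting at start_s seconds."""
    start = int(round(start_s * fs_hz))
    return recording[start:start + window_length(fs_hz, window_s)]

# Stand-in for the 5.7 s recording (random noise, not whale data).
rng = np.random.default_rng(0)
recording = rng.standard_normal(int(5.7 * FS_HZ))

segment = extract_segment(recording, start_s=1.0)  # arbitrary start time
print(window_length(FS_HZ, WINDOW_S), len(segment))
```

At these settings one 17.2 ms sample corresponds to about 256 data points, which is the amount of data each EMD decomposition operates on.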

Analysis Results
In this section, the HWV samples were adaptively decomposed into six IMFs and one RF. Because one original HWV sample in the time domain was expanded into six IMFs and one RF in the time domain, a higher-resolution TF signal analysis of the HWV samples could be achieved. The wave structures of the IMFs and RF for the Class I, II, and III HWV samples were illustrated using the EMD analysis method.
The numbers of IMFs for the Class I, II, and III HWV samples were determined. The average instantaneous frequencies (IFs) of IMF1-IMF6 and RF for the Class I, II, and III HWV samples were evaluated. The average energy ratios of the IMF1-IMF6 and RF to the referred total energy, and the average energy ratios of IMF1-IMF6 and RF in several frequency bands, were elaborated. The significant feature information views for the Class I, II, and III HWV samples were extracted in detail: the analysis sample duration, the number of IMFs, the average energy ratios of the significant IMFs and RF to the referred total energy, and the average energy ratios of the significant IMFs and RF in the significant frequency bands to the referred total energy.

Class II HWV Samples
Five Class II HWV samples were analyzed. Sample 1 from this class was adaptively decomposed into six IMFs and one RF using the EMD method.

Class III HWV Samples
Sixteen Class III HWV samples were analyzed. Sample 1 from this class was adaptively decomposed into six IMFs and one RF using the EMD method. As for the other classes, the number of IMFs depended on the input Class III HWV samples. The mean IFs of IMF1-IMF6 and RF for Class III HWV sample 1 were 5.720, 2.890, 1.445, 0.330, 0.245, 0.141, and 0.038 kHz, respectively. The average mean IFs of IMF1-IMF6 and RF for Class II HWVs were 5.422, 3.153, 1.407, 0.475, 0.242, 0.138, and 0.040 kHz, respectively.
Figure 14 shows that the average values of $IMFRE_{hwvi}$ and $RFRE_{hwv}$ for the Class I, II, and III HWV samples were higher than 10%.
Table 2 lists the HHT-based features extracted from the HWV samples, SW click samples [9], and blue whale B call vocalizations (BWBCV). The durations of the Class I, II, and III HWV samples were all 17.2 ms; those of the Click I and II SW samples were 10 and 5 ms, respectively; and those of the Class I and II BWBCV samples were both 180 ms. The number of IMFs was 6 for each of the Class I, II, and III HWV samples, 7 and 6 for the Click I and II SW samples, respectively, and 5 for each of the Class I and II BWBCV samples.
The average energy ratios of the IMF1 to the referred total energy for the Class I, II, and III HWV samples were 46.37%, 32.06%, and 34.29%, respectively. The average energy ratios of the IMF1 to the referred total energy for the Click I and II SW samples were 61.50% and 73.33%, respectively. The average energy ratios of the IMF1 to the referred total energy for the Class I and II BWBCV samples were 83.40% and 32.63%, respectively.

Discussion
The average energy ratios of the IMF1, IMF2, IMF3, and IMF4 to the referred total energy for the Class II BWBCV samples were 32.63%, 37.00%, 11.95%, and 12.07%, respectively. The highest ratio of the average energy of IMF1 to the referred total energy of the Class I BWBCV samples was 83.40%. Additionally, the highest ratio for the Class II BWBCV samples was that of IMF2, at 37.00%, and the ratio of IMF1 was the second highest, at 32.63%. The average energy ratio of the IMF2 to the referred total energy for the Class II HWV samples was 29.22%, and the average energy ratios of the IMF2 to the referred total energy for the Click I and II SW samples were 12.41% and 13.89%, respectively. The average energy ratios of the RF to the referred total energy for the Class I, II, and III HWV samples were 34.21%, 22.64%, and 38.33%, respectively. The energy distributions of IMF1 and IMF2 for the Click I and II SW samples were important. [...] 24.08%, in the 10-18 Hz band with a ratio of 28.29%, in the 4-7 Hz band with a ratio of 10.38%, and in the 5-6 Hz band with a ratio of 11.36%. The higher-frequency components of the Class I and II BWBCV samples were 34-52 and 41-52 Hz, respectively. The lower-frequency components of the Class II BWBCV samples were 34-37 Hz.
The higher-frequency components of the Class I, II, and III HWV samples were 3725-4470, 745-1490, and 2980-3725 Hz, respectively, and those of the Click I and II SW samples were 11-15 and 8-15 kHz, respectively. The lower-frequency components of the Class I, II, and III HWV samples were in the range of 14.9-22.35 Hz, and those of the Click I and II SW samples were 4-5 and 0-1 kHz, respectively. These results thus reveal new MS-based views of the energy distribution characteristics of the Class I, II, and III HWV samples.
The average energy ratios of MS1 to the referred total energy in adjacent frequency bands can be summed for the Class I HWV samples. The average energy ratios of MS1 in the ranges of 2980-3725 Hz and 3725-4470 Hz to the referred total energy for the Class I HWV samples were 9.83% and 13.79%, respectively. The average energy ratio of MS1 in the combined range of 2980-4470 Hz to the referred total energy for the Class I HWV samples was therefore 23.62%. The corresponding average energy ratio of MS1 in the range of 2980-4470 Hz for the Class III HWV samples was 18.95%.
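The band merging described above is a plain sum of per-band ratios, which the following short check reproduces. The Class I values are those quoted in this paragraph; the Class III per-band values (12.06% and 6.89%) are those reported elsewhere in the paper; the helper name `merged_band_ratio` is hypothetical.

```python
# Average MS1 energy ratios (in %) per band, as quoted in the paper.
class_i = {"2980-3725 Hz": 9.83, "3725-4470 Hz": 13.79}
class_iii = {"2980-3725 Hz": 12.06, "3725-4470 Hz": 6.89}

def merged_band_ratio(ratios):
    """Sum per-band ratios to get the ratio over the merged band."""
    return round(sum(ratios.values()), 2)

print(merged_band_ratio(class_i))    # 23.62
print(merged_band_ratio(class_iii))  # 18.95
```

The sum is valid because the bands do not overlap and every ratio shares the same denominator, the referred total energy.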
The average energy ratios of MS1 and MS2 to the referred total energy in the same frequency band can be summed for the Class II HWV samples. The average energy ratios of MS1 and MS2 to the referred total energy in the range of 745-1490 Hz were 14.68% and 18.99%, respectively, so their combined average energy ratio in the range of 745-1490 Hz for the Class II HWV samples was 33.67%. Table 1 shows the comparison of the features extracted from HWVs, HWV, HWV songs, and HWV non-song. Using the Fourier analysis method, Frazer et al. [20], Au et al. [21], Kugler et al. [22], Mercado et al. [23], Bilal et al. [24], and Angela et al. [25] demonstrated the important frequency bands of HWV to be in the ranges of 20-8000 Hz, >21 kHz, 0-1500 Hz, <400 Hz and 3000-8000 Hz, 8-8000 Hz, and 9-6000 Hz, respectively.

Conclusions
In this paper, 36 HWV samples were classified into Classes I, II, and III, which consisted of 15, 5, and 16 samples, respectively. These samples were decomposed into six IMFs and one RF using the EMD method. The first sample of Class I was illustrated. The average values of $IMFRE_{hwv1}$ and $RFRE_{hwv}$ for the Class I samples, of $IMFRE_{hwv1}$, $IMFRE_{hwv2}$, and $RFRE_{hwv}$ for the Class II samples, and of $IMFRE_{hwv1}$, $IMFRE_{hwv6}$, and $RFRE_{hwv}$ for the Class III samples were all greater than 10%.
The average important energy ratios of IMF1 to the referred total energy for the Class I, II, and III HWV samples were 46.37%, 32.06%, and 34.29%, respectively. The average important energy ratios of RF to the referred total energy for the Class I, II, and III HWV samples were 34.21%, 22.64%, and 38.33%, respectively. The average important energy ratios of MS1 in the high-frequency bands of 2980-3725 and 3725-4470 Hz to the referred total energy for the Class I HWV samples were 9.83% and 13.79%, respectively. The average important energy ratio of the MS1 in the high-frequency band of 745-1490 Hz to the referred total energy for the Class II HWV samples was 14.68%.
The average important energy ratios of the MS1 in the high-frequency bands of 2980-3725 and 3725-4470 Hz to the referred total energy for the Class III HWV samples were 12.06% and 6.89%, respectively. The average important energy ratio of the MS2 in the high-frequency band of 745-1490 Hz to the referred total energy for the Class II HWV samples was 18.99%. The average important energy ratio of the MS6 in the low-frequency band of 52.15-59.60 Hz to the referred total energy for the Class II HWV samples was 10.27%. The average important energy ratios of the MS RF in the low-frequency band of 14.90-22.35 Hz to the referred total energy for the Class I, II, and III HWV samples were 26.99%, 21.63%, and 32.83%, respectively.
The MS characteristics of the Class I, II, and III samples in the high- and low-frequency bands were revealed. The time and frequency analytical resolutions of the proposed HHT-based analysis method for HWV samples were 17.2 ms and 7.45 Hz, respectively, so high TF analytical resolutions were achieved. The proposed MS-based analytical method for HWV samples is easy to implement in software and hardware, and a short analysis time can be achieved. The results of this paper provide a better understanding of the IMF and MS energy distribution characteristics of HWV samples when HHT-TF analytical methods are used. The EMD-based analysis revealed new feature information views, including the analysis sample duration, the number of IMFs, the significant IMFs and RF, and the MS1, MS2, MS6, and MS RF features.